NHANES %>% filter(Gender=="female") %>%
ggplot(aes(x=DirectChol)) +
geom_histogram(aes(y=..density.., fill=..count..),bins=30) +
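geom_density(aes(y=..density..))
Step by step, this code does the following.
Select the females and pipe the result to ggplot:
NHANES %>% filter(Gender=="female") %>%
Select the variable to plot:
ggplot(aes(x=DirectChol)) +
Draw the histogram. Bins of equal width make the histogram easier to interpret; the number of bins can be set with the bins argument of geom_histogram. Plotting relative frequencies (densities) on the y-axis enables visual comparison between histograms:
geom_histogram(aes(y=..density.., fill=..count..)) +
Add a kernel density estimate on the same density scale:
geom_density(aes(y=..density..))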
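The ..density.. notation used above still works, but it is deprecated from ggplot2 3.4.0 onwards in favor of after_stat(). The following is a minimal sketch of the same histogram-plus-density layering in the newer notation. To keep it self-contained it uses simulated values as a stand-in for DirectChol; the data frame sim and its contents are assumptions for illustration only, not NHANES data.

```r
library(ggplot2)

# Simulated stand-in for the cholesterol values (illustrative assumption, not NHANES)
set.seed(1)
sim <- data.frame(DirectChol = rlnorm(500, meanlog = log(1.4), sdlog = 0.3))

p_hist <- ggplot(sim, aes(x = DirectChol)) +
  # y = density gives relative frequencies; fill shows the raw count per bin
  geom_histogram(aes(y = after_stat(density), fill = after_stat(count)), bins = 30) +
  # kernel density estimate on the same density scale
  geom_density(aes(y = after_stat(density)))

p_hist
```

Because the bar heights are densities, the bar areas sum to one, which is what makes histograms of groups with different sample sizes comparable.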
With ggplot we define an x variable when we make a boxplot. If we use a constant string for x, all data are put in one category and a single boxplot is constructed.
NHANES %>% filter(Gender=="female") %>%
ggplot(aes(x="",y=DirectChol)) +
geom_boxplot()
So we can add a boxplot to a ggplot figure by using the geom_boxplot function.
If the dataset is small to moderate in size, we can also add the raw data to the plot with the geom_point() function and the position="jitter" argument. Note that we then also set the outlier.shape argument of the geom_boxplot function to NA, so that the outliers are not plotted twice.
Here, we will again plot the relative abundances of Staphylococcus from the armpit transplant experiment.
ap<-read_csv("https://raw.githubusercontent.com/GTPB/PSLS20/master/data/armpit.csv")
ap
# A tibble: 20 x 2
trt rel
<chr> <dbl>
1 placebo 55.0
2 placebo 31.8
3 placebo 41.1
4 placebo 59.5
5 placebo 63.6
6 placebo 41.5
7 placebo 30.4
8 placebo 43.0
9 placebo 41.7
10 placebo 33.9
11 transplant 57.2
12 transplant 72.5
13 transplant 61.9
14 transplant 56.7
15 transplant 76
16 transplant 71.7
17 transplant 57.8
18 transplant 65.1
19 transplant 67.5
20 transplant 77.6
ap %>%
ggplot(aes(x=trt,y=rel)) +
geom_boxplot(outlier.shape=NA) +
geom_point(position="jitter")
When we specify a categorical variable for x (here trt), we get one boxplot for each treatment group.
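By default, position="jitter" displaces the points randomly in both the horizontal and the vertical direction, so the figure changes slightly on every run. One possible refinement, sketched below, is to call position_jitter() explicitly with height = 0 (the plotted values stay exact) and a fixed seed (the figure is reproducible). The sketch rebuilds the data frame from the values as printed in the tibble above, so it runs without the download; the width of 0.2 and the seed are arbitrary choices for illustration.

```r
library(ggplot2)

# Values copied from the printed tibble above (as displayed, i.e. rounded for printing)
ap <- data.frame(
  trt = rep(c("placebo", "transplant"), each = 10),
  rel = c(55.0, 31.8, 41.1, 59.5, 63.6, 41.5, 30.4, 43.0, 41.7, 33.9,
          57.2, 72.5, 61.9, 56.7, 76, 71.7, 57.8, 65.1, 67.5, 77.6)
)

p_box <- ggplot(ap, aes(x = trt, y = rel)) +
  # hide the boxplot's own outlier markers so no point is drawn twice
  geom_boxplot(outlier.shape = NA) +
  # jitter horizontally only, with a fixed seed for a reproducible figure
  geom_point(position = position_jitter(width = 0.2, height = 0, seed = 1))

p_box
```

With height = 0 the vertical position of every point equals the observed relative abundance, so the points can be read directly against the y-axis.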